Regular Path Queries (RPQs) are a type of graph query where answers are pairs of nodes connected by a sequence of edges matching a regular expression. We study the techniques to process such queries on a distributed graph of data. While many techniques assume the location of each data element (node or edge) is known, when the components of the distributed system are autonomous, the data will be arbitrarily distributed. As the different query processing strategies are equivalently costly in the worst case, we isolate query-dependent cost factors and present a method to choose between strategies, using new query cost estimation techniques. We evaluate our techniques using meaningful queries on biomedical data.
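To make the definition of an RPQ concrete, here is a minimal single-machine sketch in Python. It is not one of the distributed strategies studied in the paper. It assumes the regular expression has already been compiled into a finite automaton over edge labels, and it runs a breadth-first search over the product of the graph and that automaton. The function name `eval_rpq`, the automaton encoding, and the toy graph (node and label names included) are illustrative assumptions, not taken from the paper or its data.

```python
from collections import deque


def eval_rpq(edges, delta, start_state, accept_states):
    """Return all (source, target) node pairs joined by a path whose
    sequence of edge labels is accepted by the automaton.

    edges: iterable of (source, label, target) triples.
    delta: dict mapping (state, label) -> iterable of next states.
    """
    # Adjacency list: node -> list of (label, neighbour).
    adj = {}
    nodes = set()
    for src, label, dst in edges:
        adj.setdefault(src, []).append((label, dst))
        nodes.update((src, dst))

    answers = set()
    for source in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        seen = {(source, start_state)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accept_states:
                answers.add((source, node))
            for label, neighbour in adj.get(node, []):
                for next_state in delta.get((state, label), ()):
                    pair = (neighbour, next_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return answers


# Hypothetical toy graph; the names are made up for illustration.
edges = [
    ("geneA", "encodes", "protB"),
    ("protB", "interacts", "protC"),
    ("protC", "interacts", "protD"),
]

# Hand-built automaton for the expression: encodes . interacts+
delta = {
    (0, "encodes"): [1],
    (1, "interacts"): [2],
    (2, "interacts"): [2],
}

result = eval_rpq(edges, delta, start_state=0, accept_states={2})
print(sorted(result))  # [('geneA', 'protC'), ('geneA', 'protD')]
```

In this sketch the whole graph sits in one adjacency list. In the distributed setting the abstract describes, the edges leaving a node may be held by any autonomous component, so the cost of expanding each (node, state) pair depends on the query and the data placement.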